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The unbinding process of a protein-ligand complex of major biological interest was 
investigated by means of a computational approach at atomistic classical mechanical 
level. An energy minimisation-based technique was used to determine the dissocia- 
tion paths of the system by probing only a relevant set of generalized coordinates. 
The complex problem was reduced to a low-dimensional scanning along a selected 
distance between the protein and the ligand. Orientational coordinates of the es- 
caping fragment (the ligand) were also assessed in order to further characterise the 
unbinding. Solvent effects were accounted for by means of the Poisson-Boltzmann 
continuum model. The corresponding dissociation time was derived from the calcu- 
lated barrier height, in compliance with the experimentally reported Arrhenius-like 
behaviour. The computed results are in good agreement with the available experi- 
mental data. 



I. INTRODUCTION 

Biological processes are driven by interactions between the molecular components of cel- 
lular machinery, commonly between proteins and their target molecules (genericaliy termed 
ligands). Most of these processes portray a cascade of protein-ligand association/dissociation 
events, and thus, knowledge and control of their energetics and kinetics is of key importance 
in molecular biology, proteomics, clinical diagnosis, and therapeutic research, to name a few. 

Protein-ligand dissociation is, in essence, a fragmentation of complex multi-atomic ag- 
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gregates. Many-body aggregates are very ubiquous in Nature, and have been the object of 
extensive experimental and theoretical studies in a wide range of natural science research 
fields: examples range from nuclear fission to atomic clusters fragmentation to dissociation 
of insulin from its receptor on the cell membrane, etc. A vast amount of data has now 
been accumulated, but there is still a need for an efficient and physically sound theoretical 
approach that could possibly rationalize these data and make insightful predictions, the 
applicability of one such approach being obvious. A first step is to try and identify the 
common features underlying dissociation events of different nature. 

Clustering and fragmentation processes in nuclear and atomic cluster physics have already 
been found to possess many features in common (for a comprehensive review see ref. 
The emerging key idea is that those processes can be successfully described in terms of a 
few collective coordinates that define the overall geometry configuration of the escaping and 
parent fragments The same basic concept also holds for similar processes in more 

complex systems, like the fragmentation of a dipeptide [3]. On the basis of this principle, 
the present paper addresses the dissociation process of an aggregate of higher complexity, a 
biological protein-ligand adduct (often referred as a complex). 

A most remarkable protein-ligand system is the antibody-antigen one, which is involved 
in a fundamental recognition process during the body immune response. This response is 
triggered by foreigner molecules - the antigens ( anti body generator, AG). One key mecha- 
nism whereby the immune system recognizes and targets them for destruction is by releasing 
antibodies ( anti- foreign body, AB) jjj. ABs are very large proteins, and the human body has 
a potential repertoire of 2.5 xlO 11 different ones. Yet, they all feature a basic scaffold: they 
consist of two identical "light" (L) and "heavy" (H) chains of amino acids entangled in a 
Y-shape fold as shown in Figure [TJ Each tip of the Y branches displays a distinctive variable 
region, i.e., the specific "lock" for which the target AG has the "key" (see the schematic 
inset in Figured]); the two tips are identical for each AB. The "key" region of the AG can 
be a small protein fragment or a hapten. A hapten is a low molecular weight compound 
originally attached to some carrier protein, that will also trigger the release of ABs. Upon 
exposure to a particular AG, a set of ABs is refined to target it, via a mutation process 
a a. The mutations occur in the referred variable region (hence it is called "variable"). 
Along a maturation series, the increase in affinity strongly correlates with an increase in the 
corresponding AB-AG dissociation times, r Usually, r is expressed in terms of 




FIG. 1: Overall ribbon representation of a complete AB structure. The two pairs of heavy chains are 
depicted in red and blue, and the corresponding light chains in yellow and grey. The dashed ellipse 
highlights one of the branches that bind to the antigen (the so called Fab, after fragment binding 
antigen), in an all-atom representation; the trapezoidal region puts in evidence the Fab variable 
domains (with added hydrogens), and the dashed arc illustrates the chains' cleavage sections for 
these variable domains to be detached. A simplified scheme of AB-AG binding is presented in the 
inset. 



the rate of spontaneous dissociation, k s — 1/r. 

Not surprisingly, much effort has been devoted to the determination of those k Q g values, 
with some of the most innovative experiments involving sensitive micromanipulation tech- 
niques like atomic force microscopy (AFM) and other force probe procedures to measure 



AB-AG binding forces 
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Il2l . [13I . Some further insight into the molecular struc- 



ture, interactions and unbinding pathways underlying such single molecule experiments has 
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been gained from computer simulations using "force probe" molecular dynamics (FPMD) 
JjJ]. However, the question arises of to what extent the measured unbinding force in the 
mechanically speeded up process of pulling out the ligand relates to the thermodynamic or 
kinetic parameters describing the spontaneous dissociation. The later arises in the minute 
time scale js] in contrast to the time scales of AFM (millisecond) and FPMD (nanosecond). 
There is also the matter of across which pathway is unbinding being forced. 

In the absence of a pulling force, one regains the spontaneous (natural) mode of AB-AG 
dissociation, a thermally activated barrier-crossing along a preferential path in a multidi- 
mensional energy landscape. The contributing activated states (which determine k g) may 
well be described in terms of a few collective coordinates, in close analogy to other studied 

n n n 

fragmentation processes [l|, |2|, |3|. Within this context, it is reasonable to constrain the many 
other degrees of freedom that only contribute to the negligible fine structure of the energy 
landscape. This is a rational approach to probe the unbinding of a complex biological system 
like the AB-AG one, in order to calculate the corresponding energetic barrier and derive k Q s 
from it. 

Starting with an experimentally well studied AB-AG complex, an anti-fluorescein one 
(vide infra), here we describe a computational approach at molecular (atomistic) level to 
explore its preferential unbinding pathways by probing only a few relevant degrees of free- 
dom. A detailed analysis of its dissociation pathway and dependence on the distance and 
relative orientation of the molecules in question is presented. The introduction of solvent 
effects is also discussed along with its implications on the results, and the dissociation rate 
(k Q s) is derived from the calculated energy barriers. Following this introduction, the selec- 
tion of the AB-AG system is described in detail. Next, a brief overview of the theoretical 
methods adopted in this study is given, in particular the computational level, the force 
field and the extent to which the solvent effects have been introduced. In section [IV] the 
results are presented, compared with the available experimental data, and discussed. The 
last section is devoted to the conclusions. 



II. THE TEST CASE 



Fluorescein (Flu) is a synthetic hapten. It is extensively used in fluorescence-based kinetic 
measurements of off-rates (k Q s) 15], and a valuable reference system for the understanding 
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of important immunological issues. Anti-fTuorescein AB-AG complexes are also clear-cut 
models in the sense that Flu is a small inert and rigid ligand (see Figure [2]) and the off-rates 
of a number of anti-Flu complexes have been found to display an Arrhenius-like behaviour 

The current study has been carried out for the anti-ffuorescein IgG monoclonal antibody 
4-4-20 (mAb4-4-20), one of the most extensively studied by thermodynamic, kinetic, struc- 
tural, spectroscopic, and mutational methods ([3], and references within), and for which 
two crystallographic structures of its Fab region (highlighted in Figure [1]) have already been 



reported 13, ll8( • A complete IgG mAb4-4-20 molecule has two identical Fab fragments, each 
consisting of two constant and two variable domains. The two variable domains (labelled 
Vl and Vh) constitute the so called Fv fragment (also highlighted in Figured]), which is 



the minimal antigen-binding fragment. In 
that feature only the Vl and V h domains 



act, there are many genetically engineered ABs 



19| . This practice further endorses the idea of a 



system with a restricted number of binding-determinant degrees of freedom. It also makes 
it realistic (and computationally less demanding) to consider just the mAb 4-4-20 variable 
domains: Vl with 112 amino acids and V h with 118. 



III. METHODS 
A. Force field 



Even reducing the system to the mAb4-4-20 two variable domains plus Flu, it amounts to 
ca. 3600 atoms. It is, thus, too big to be computationally addressed at any level of quantum 
mechanics. A realistic simplification is to assume that the nuclei move in the average field 
created by all particles, and use an empirical fit to this field - an effective potential commonly 
known as a force field. One then uses the computationally less demanding classical mechanics 
formalism to calculate both static properties (equilibrium structures, relative energies, etc.) 
and the time evolution of the system. 

Much effort has been devoted to develop force fields suitable for studying proteins, the 
CHARMM force field j^l (the one used in the present work) being one of the most widely 
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FIG. 2: Structural formula and assigned atom labels for Fluorescein {2-(6- hydroxy-3-oxo-(3H)- 
xanthen-9yl) benzoic acid}. The force-field atom types (see sub-section UUCP and the partial 
charges (units of e) are listed in the table underside. The dashed line puts in evidence the two 
aromatic (ring) fragments labelled and grouped in the table. 
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The energy £" is a function of the positions of all atoms in the system. The first four sum- 
mations are known as bonding terms: they extend to the topologically defined N r covalent 
bonds V, Ng bond angles l 6\ dihedral angles '0' and N x improper torsion angles 
respectively. For some specific bond angles, an additional bonding term may be required as 



a function of the distance 'S" between the first and third atoms [20(. The term for the bonds 
describes the energy required to deform a bond from its equilibrium value (denoted by the 
subscript '0'), within a harmonic approximation, and an analogous description holds for the 
remaining harmonic terms; the dihedral term is chosen differently to satisfy the dihedral 
periodicity. The last two summations of equation (13.11) are extended to all N non-bonding 
atom pairs ij separated by three or more covalent bonds. The Lennard- Jones 6-12 potential 
term accounts for the van der Waals (vdW) interactions: for each atom type (say i), there 
is a Ri distance corresponding to a well depth e^, with i?j = 2 1//6 <7j and cr, the distance 
for which the Lennard- Jones potential equals zero; the Lennard- Jones parameters between 
pairs of different atoms are obtained from combination rules, the e^- values based on the 
geometric mean of q and 6j and Rij values form the arithmetic mean between Ri and Rj. 
The Coulombic potential is defined for the pairs of charges qt and qj separated by a distance 
rij, and for a given dielectric constant e (the vacuum one by default). The equilibrium values 
in the harmonic terms and the Ri and e« values are parameters derived from experimental 
data {e.g., crystallographic structures) and ab initio quantum mechanical calculations on 



small reference molecules, presented and discussed in ref. 20[j. Partial charges (g«, qj) are 
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also derived from such ab initio calculations. 



B. Implicit solvent 

Proteins operate in aqueous solution. Solvation, stability and dissociation of molecules 
in water are largely governed by electrostatic interactions. This is particularly pertinent in 
proteins: more than 20% of all amino acids in globular proteins are ionized under physiolog- 



ical conditions and polar side-chains occur in over another 25% amino acids [21] . However, 
introducing explicit water molecules in a computational simulation to account for solvent 
effects dramatically increases the calculation time. Moreover, when the calculations involve 
any energy minimisation-based technique like calculating minimum energy reaction paths, 
the explicit water molecules will arrange in a single conformation matrix, exerting forces on 
the solute that are very different from the solvent mean force. 

Alternatively, a continuum treatment of the solvent as a uniform dielectric may provide 
an accurate enough description of such interactions, as long as one accounts for the fact 
that a protein in aqueous solution (the physiological medium) yields a system with two 



very different dielectric media 221 ] . A most physically correct implicit solvent model arises 



from solving the so-called Poisson-Boltzmann (PB) equation (see [221 . |23| . and references 
within). The protein (macromolecule) is treated as a low-dielectric cavity bounded by the 
molecular surface and containing partial atomic charges - typically taken from the classical 
molecular mechanics force field. The solvent (water) is implicitly introduced by assuming a 
high-dielectric surrounding of the protein. And since under physiological conditions macro- 
molecules are dissolved in dilute saline solutions (water with a dissolved electrolyte), a term 
for the average charge density due to the mobile ions is also included. This classical con- 
tinuum electrostatics treatment relies on the (reasonable) assumption that it is possible to 
replace the ionic potential of mean force with the mean electrostatic potential, neglecting 
non-Coulombic interactions {e.g., vdW) and ion correlations. The actual PB equation reads: 

V ■ [e(r)Vp(r)] = -47rp(r) (3.2) 

N 

- 47ry^eq^(r)A(r), 

i=l 

with 

rii(r) = n°exp(eqi^(r)/fc B T). (3.3) 
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Equation (13.21) relates the electrostatic potential <p to the distribution of the protein atomic 
partial charges (charge density p), the dielectric properties of both the protein and solvent 
(e, position dependent (r)), and the charge density due to the mobile ions given by the 
summation term; qi is the charge of ion type i, nj(r) its local concentration, e the elementary 
charge and A(r) a parameter that describes the ions' accessibility at position r. The boundary 
condition is <p(oo) = 0. For each ion type, rij(r) is described by a Boltzmann distribution 
(13. 3p where n° is the ion's concentration in bulk solution, ks the Boltzmann constant and T 
the absolute temperature. As for the accessibility parameter, a general consensus is that any 
point within one ionic radius from the macromolecule is inaccessible {i.e., A(r) = 0), and it 
is implicit that the region inside the macromolecular surface is inaccessible. The remaining 
region outside has A(r) = 1. A typical value for the ionic raidus is the one of Cl~ (2 A), 
considering that Na + Cl~ (sodium chloride) is a most frequently chosen electrolyte. 

A description of the several possible approximations and numerical techniques used to 
solve equation ( 13. 2ft is beyond the scope of the present paper, the reader being referred to 
the supporting literature of the software used in this work, APBS (Adaptative Poisson- 



Boltzmann Solver) 



23 



24j . Briefly, the solute's charges are mapped onto a mesh and the 
electrostatic potential in the presence of the dielectric continuum solvent is determined at 
each point, via a finite difference numerical solution of the PB equation. The mesh being a 
finite one, it is necessary to set up the boundary potentials (at the lattice edge) accordingly. 
For the present work, they are approximated by the sum of the Debye-Hiickel potentials of 



all the charges, meaning 

exp(-n/A D ) 
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^ water 
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C. Fluorescein parameters 

The available CHARMM parameterisation already has parameters for all amino acids but 
not for fluorescein, so one has first to describe the later consistently with the force field. In the 
present work, the required bonding and Lennard- Jones parameters where derived by analogy 
to similar ones existing in CHARMM. Partial atomic charges were fitted to reproduce the 
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molecular electrostatic potential (ME 



the Merz-Singh-Kollman scheme 



2a 



') at selected points around the molecule according to 



261 ] implemented in the Gaussian03 program 27| . The 



points are located in layers around the molecule, the first layer corresponding to the van der 
Waals molecular surface scaled by a factor of 1.4; the default scheme then adds three more 
layers with scaling factors 1.6, 1.8 and 2.0. The MEP was generated at the DFT/B3LYP 
level with the 6-31G(d) basis set. DFT (density functional theory) is now a widely used 
and computationally convenient quantum mechanical method well documented in many 
reference books (e.g., 281]): it makes use of exchange correlation functionals dependent on 
the electron density and its gradient to tackle electron correlation effects. B3LYP was the 
functional of choice and stands for the three-parameter Becke functional combined with the 



Lee- Yang-Parr correlation functional 5 



For the quantum mechanical calculations, the coordinates for the starting Flu conforma- 
tion were taken from the complex crystal structure with the best resolution (1.85 A 18]), 
which has the coordinates deposited in the RCSB Protein Data Bank 29J with entry name 
1FLR. Only the acidic deprotonated form of Flu was considered, since this is the active form 
in the experiments underlying the current study. The structure was energy optimized before 
charge fitting. The charges were then further refined by similarity to the set of already 
defined ones in the CHARMM force field (for details on the approach see 30|, |3l|, |32j). The 
ensuing Flu set of parameters was then used as an extension of the CHARMM parame- 
terisation, the corresponding CHARMM atom types and partial charges being indicated in 
Figure El 



D. Reference geometry 

The variable domains (Vl and Vh) were extracted from the L and H segments of the anti- 



Flu 4-4-20Fab 1FLR crystal structure 



18 



291 ] . Crystallographic waters were stripped from 



the structure and the C-terminal amino acids were capped with -NH 2 functional groups. A 
representation of the system is shown in Figure [31 The positions of the protein's missing 
hydrogens were initially guessed. Next, any latent close contacts or anomalous bonding 
positions were cleared out by relaxing the structure to an energy gradient tolerance of 0.05 
eV-A -1 , at the classical mechanics level using the NAMD program (33)] with the extended 
CHARMM parameterisation. This relaxed structure fully retains the experimental X-ray 
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conformational features and it was used as the starting conformation for the scanning. A 
full structure optimisation (i.e., using a tighter energy gradient tolerance) was also carried 
out but it introduced many small errors at the protein's secondary structure level. This 
is because secondary structure relies on a network of backbone hydrogen bonds, which are 
less accurately described in the framework of the simplified molecular mechanics force field 
theories. The CHARMM energy difference between the relaxed and fully minimised geome- 
tries is ~100 eV. The crystallographic structure itself has been resolved at a temperature of 
~290 K [l8|, thus an estimate of the corresponding average thermal energy (considering kgT 
per degree of freedom) amounts to ~270 eV for our simulation system. This indicates that 
the full minimisation is only reaching some local minima. Considering the above referred 
limitations of the force field, it is judicious to take the minimally relaxed structure (closer 
to the X-ray one) as the reference structure for the subsequent simulations. 

Flu is a particularly rigid molecule. Its essential degree of freedom is the torsion around 
the bond between the xanthenone and the carboxyphenyl aromatic rings (see Figure EJ), 
defining the angle between the planes of these two rings. This angle has the value of —63° 
in both the crystal complexed form 18] and the crystalline free Flu I24J]. For the above 



referred CHARMM energy relaxed structure, the value of this angle is —67°. An energy 
optimization was also carried out for the hapten alone (without the AB), the value for the 
angle in question being -62°. Moreover, the RMSD (root-mean-square deviation) between 
the crystallographic and relaxed Flu bound structures is 0.201 A. These results are a good 
indication of the validity of the derived set of CHARMM parameters. 

E. Distance scanning 

In the pursuit for the suitable reaction coordinates to describe the system's unbinding, the 



distance between the protein and_^ 
to some cluster fission processes 



lu mass-centres could be a first option, in close analogy 
l|. Yet for reasons that will become clear next, a distance 
between two rationally selected atoms has been considered instead. 

The shape of the binding pocket hosting Flu is most complementary to this hapten, 
with a few amino acids at the rim of the pocket gating the entrance. Superimposing 
the two available crystal structures results in an overall RMSD of 0.419 A for the Flu 
atoms and 1.854 A for the protein, with a few of those rim amino acids exhibiting some 
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FIG. 3: All-atom representation of the Fv-fragment of the mAb4-4-20-Flu complex structure. The 
ribbon representation highlights the backbone of the two chains, the H-chain in blue and the 
L-chain in gray. The Flu molecule is depicted in ball-and-stick and coloured by element (CPK 
space- filling). 



of the larger individual RMSD values (up to 3.7 A). Out of those, five amino acids - 
His31i, Asn33L, Tyr56#, Tyrl02# and Tyrl03# - strategically "frame" amino acid Arg39z, 
at the bottom of the pocket, as depicted in Figure HI Amino acids are labelled (in 
the text and Figure @J according to the standard amino acid 3-letter code ( |j] ; see also 
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Tyr103H Tyr102H 



FIG. 4: Fluorescein (van der Waals spheres in shades of grey) docked in the antibody's binding 
cavity. The backbone of the antibody is depicted in black sticks; the framing amino acid acids 
(labelled as described in the text) have their side-chains displayed in coloured ball-and-stick. (A) 
front view of the entrance of the cavity. (B) top view showing residue Arg39^ at the bottom of 
the cavity, with its atom (zeta-carbon, standard nomenclature) highlighted in yellow: the red 
dotted lines represent hydrogen bonding between Arg39^ and the hapten. 



http : / /www . chem . qmul . ac . uk/ iupac/| AminoAcid/), the number of the amino acid in the 
1FLR file and the chain identifier in subscript (e.g., Arg39i, refers to an Arginine that is 
numbered 39 in the L-chain). As highlighted in Figure 0]-B, Arg39^ has its +l-ionized group 
(centred on atom C^) directly involved in hydrogen bonding to the hydroxyl group of Flu 
(atoms 01-H12 in Figure |5J). Arg39i is also a mutation introduced during the matura- 



tion process of mAb4-4-20 



9J]: the original residue was a neutral, weakly polar Histidine. 



Upon this His-to-Ar 
imentally observed 



mutation a slowing in the unbinding of Flu by a 1.5-fold was exper- 



no doubt in consequence of the increased attraction between the 
mutated amino acid 39^ and Flu. Thus, it was only logical to consider the distance between 
the groups of Arg39^ and Flu engaged in that driving hydrogen bonding as a most likely 
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unbinding coordinate. 

The distance between the atom of Arg39^ and the Flu's hydroxyl oxygen (01) was 
then set as the appropriate coordinate for scanning. The scanning started from the distance 
in the reference conformation and progressed in increments of 0.25 A until a ~40 A distance. 
At this distance and for the set cut-off, the interaction energy between the hapten and the AB 
becomes zero. The scanning was also performed for a few decreasing steps, i.e., for distances 
smaller than the one in the reference structure. The distance value at each scanning step was 
imposed by means of a strong harmonic constraint (force constant = 26 eV-A -2 ) between 
the referred oxygen and a dummy atom placed at the same coordinates of the atom. The 
need for a dummy atom arises from the fact that, in CHARMM, the non-bonding energy of 
all atom pairs separated by less than three covalent bonds is excluded 20]; the introduced 
harmonic constraint is an "artificial bond" and therefore should not be directly set between 
the and OH atoms, otherwise several pair interactions between Arg39^ and Flu would 
be wrongly excluded. 

For each scanning step, the system was energy minimised with NAMD to an energy 
gradient tolerance < 4xl0~ 4 eV-A" 1 . A 12 A cut-off on long-range interactions with a 
switch smoothing function between 10 and 12 A was used. During minimisation, the hapten 
was free to move (subject only to the scanning harmonic constraint) while the AB was kept 
frozen for all but the side-chain atoms of a few key amino acids gating the passage of the 
hapten. The unconstrained side-chains belong to H1s31l, Asn33L, Arg52#, Tyr56#, Glu59#, 
Tyrl02j7 and Tyrl03#. The reported energy of each minimised structure was calculated after 
removing the referred harmonic constraint. In NAMD, it is not possible to set two different 
dielectrics within the same system, so minimisations were performed for e = 1, and solvent 
effects were introduced as corrections a posteriori, as described next. 

For the final conformation of each scanning step, the electrostatic energy was recalculated 
using the APBS program (refer to sub-section IIII Bf ) . The conformation of the last scanning 
step roughly occupies a 70 A-side cubic box. The side was extended by an extra 20 A for 
solvent media, resulting in a 90x90x90 A 3 box that was set equal for all scanning steps. 
Calculations were performed using the APBS' adaptive refinement j^]. A low dielectric 
constant of e = 2 was set for the macromolecule cavity 22j, |36j and the typical water value 



of e ~ 80 was set for the continuum solvent medium. The effect of a dilute electrolyte in 
solution was assessed with a second run of calculations, for a salt bulk concentration of 0.150 
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mol-dm 3 as in a typical physiological media 25j, and a temperature of 298 K. 



F. Exploring relative orientations 

The distance scanning scheme above described does not enforce an escaping channel 
along a straight line, nor does it restrain the AB-hapten relative orientations. For the sake 
of completion, a comprehensive overlook of the two molecules relative position and mutual 
orientation should be performed. The designated appropriate descriptors are the spherical 
coordinates (r, 9, 0) and three Euler angles (a, (3, 7), for which two coordinate frames 
are required. The referential frame, set as the protein's principal axes of moment of inertia 
given that the protein is fixed in space, and the moving-body local frame, i.e. the Flu frame. 
Care was taken to select this later, considering that during the scanning Flu does not evolve 
in space as a completely rigid body. Its centre was set in Flu's atom CI since along the 
scanning the position of Flu's mass centre is approximately coincident to this atom (0.2-0.4 
A RMSD). The xy plane was made coincident to the rigid xanthenone ring, with the x axis 
pointing in the direction of atom 01, as displayed in Figure [5j 




FIG. 5: System of internal axes for Flu. 
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G. Calculation of k 



off 



In compliance with the experimentally reported Arrhenius-like behaviour [7( and within 
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the context of the reaction-rate theory 37j, the off-rate constant along the scanned pathway 
was computed using the expression, 

k oR = u exp(- AE* /k B T) (3.6) 

where u is the pre-exponential factor determining how frequently the system approaches 
the barrier, and AE* is the activation energy (i.e., the barrier height). The harmonic 
approximation was used to estimate the Arrhenius-like pre-factor. It reads 

w = (3-7) 
Ztx 

where \i is the reduced mass of the system and k the harmonic force constant. This latter was 
obtained from parabolic fit of the data (the bounding region of the well in the energy profiles) 
using the Mathematica® software package. For systems similar to the one presented here 
(with reduced masses in the 200-500 range and binding pocket's length within 3-7 A), an 
estimation of the frequency uj falls in the 10 11 -10 12 s _1 range. 



IV. RESULTS AND DISCUSSION 
A. Energetic and structural analysis 

The energy profiles resulting from the scanning runs with and without solvent correction 
are plotted in Figures M and H 

Figure [6] displays the in vacuo results for four different scanning runs, corresponding to 
different constraining schemes on the protein atoms. The scheme referring to the seven 
unconstrained side-chains enumerated in sub-section IIII El has been labelled 'cl' in Figure [61 
To better assess on the influence of those 7 amino acids on the escaping profile, they were 
subject to successive constraining procedures, exemplified in Figure[6]for three representative 
cases, labelled 'c2', ! c3' and 'c4', that correspond to six, five and four unconstrained side 
chains (out of the initial seven). The plotted 'c3' curve, for instance, results from moving 
only the side-chains of the 5 gating residues sginaled out in the second paragraph of sub- 
section [TTTE] and visible in Figure 0]-B. 
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FIG. 6: Energy profiles (in vacuo) for different distance scanning runs, corresponding to different 
constraining schemes on the protein atoms. Each curve corresponds to a different number of gating 
amino acid side-chains that were allowed to move during each scanning, namely seven side chains 
(cl), six (c2) five (c3) and four (c4) (details in the text). 

The constraining limit is the set of amino acids Asn33i, Tyr56#, Tyrl02# and Tyrl03# 
corresponding to the 'c4' curve. Within this limit, no general significant differences on 
the energy profile arise from the explored different schemes. These 4 amino acids always 
experience significant conformational changes upon the hapten's passage, in comparison to 
the remaining moving ones which just slightly adjust positioning. The plane denned by the 
side-chain oxygens of the 4 amino acids in question can be taken as the outmost limit of the 
protein's pocket, and it is intersected at a ~15 A scanning distance. Below this separation 
distance, the total energy plots in Figure [6] depict the expected profile for an activated 
process. For the different curves, the height and shape of the energetic barrier at ~7 A is 
essentially the same: 1.029, 1.027, 1.060 and 1.026 eV respectively for 7, 6, 5 and 4 moving 
side-chains. Past the 15 A distance, the in vacuo profiles depict an asymptotic increase to a 
final plateau above the referred energetic barrier, making unbinding unfeasible. Predictably, 
the inclusion of solvent effects rectify the asymptotic behaviour depicted in the in vacuo 
profiles, as exemplified in Figure [7] for two scanning runs. At larger separation distances 
the energy profile has been significantly flattened, ant it is also for the larger distances that 
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FIG. 7: Comparison of the distance scanning energy profiles in vacuo and with implicit solvent 
corrections (with and without dissolved electrolyte), for the 'c2' and 'c3' constraining schemes. 

the effect of the dissolved electrolyte becomes perceptible. In solution, the electrostatic 
interactions between the protein and the escaping hapten are effectively screened by the 
high-dielectric, allowing for unbinding to happen. Of relevance is also the decrease in the 
height of the energetic barrier at ~7 A: with implicit solvent effects, this barrier value is 
0.863 and 0.871 eV, respectively for the 'c2' and 'c3' schemes. 

The jagged contour emerging at ~20 A also deserves some attention. A detailed analysis 
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FIG. 8: Electrostatic energy profiles (in vacuo) for the overall system, its intermolecular component 
and the interaction energy between the protein and the COO~ group (atoms C20, 04, 05 of the 
Flu's carboxyphenyl ring; see Figure [2]). The picture on top portraits the path of that ring along 
the scanning: the dashed yellow straight line puts in evidence the coordinate being scanned while 
the circle emphasises the anchor point of the COO - group at the protein' surface (see details in 
the text). The grey area puts in evidence a jagged region in the profile. 

of the energetic components was carried out, with particular emphasis on the Coulombic 
component, as exemplified in Figure [8] for scanning run 'c2'. A previous work had already 
shown that, at larger distances, electrostatics play a major role in fragmentation si]. By 
comparing Figures [7] and [HI one perceives the jagged pattern similarities between the total 
energy and its electrostatic component. Moreover, the electrostatic energy and its inter- 
molecular component - i.e., the electrostatic interaction energy between the protein and the 
hapten - run parallel. To this intermolecular energy, a major contribution comes from the 
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COO" group of Flu. This group gets anchored via hydrogen-bonds at the protein's surface 
as the hapten leaves the pocket: the anchor point is yellow circled in the top picture of Fig- 
ure El The hapten rotates around this point as the scanning distance is further increased, 
until it finally detaches from the surface. The detachment features a somewhat irregular 
trajectory of the escaping hapten: it is the region of the top picture in Figure El right above 
the grey area highlighting the jagged contour in the plot. A better perception of the hapten's 
rotation as it leaves the binding pocket can be gained from the plotting of the Euler angles 
along the scanning, presented in Figure [9j The steeper variation of the angles in the 15-20 
A region corresponds to the anchoring track of the COO" group at the protein' surface. As 
the hapten detaches, a swift change in its orientation is observed, made evident by the plots 
for the Euler angles from ~20 A on. At this stage, one can not ascertain whether or not 
this pronounced "trapping" of the hapten to the protein' surface is a genuine feature of the 
unbinding. That would at least require the other known mutations of the anti-fluorescein 
mAb4-4-20 to be subject to an analogous study, which is beyond the scope of the present 
paper. 

The fact remains that, on the overall, the total energy profiles are smooth (without 
discontinuities), as clear from the top plot in Figure [10] of the energy as a function of both 
the scanning coordinate and the radial distance (r). They portray a plausible unbinding 
channel, provided solvent effects are included, though one can not claim that they correspond 
to the minimum energy pathway. One possible step to assess that would be to perform a 
more comprehensive probing of the positional/orientational space of the hapten - beyond the 
points determined by the presently selected reaction coordinate. Yet, such a study involves 
a substantial computational effort, even if confined to some plausible escaping window in 
space. On the other hand, the presently computed profiles can be used to derive k Q g, and by 
comparison to the corresponding experimental values(s), a first evaluation of the scanning 
approach here introduced can be made. 

B. k g determination 

Table I presents the calculated values of k g, with and without solvent correction, re- 
sulting from parabolic fit to the profiles (0.99 < R 2 > 0.87), considering the energy barrier 
at ~7 A (vide supra). Experimentally available k Q g values are also presented for compari- 
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FIG. 9: Euler angles as a function of the scanned distance coordinate. 



son. It becomes immediately evident that, even for an extensively studied system like the 
mAb4-4-20-ffuorescein one, experimental fc fr values may differ by an order of magnitude, 



depending on setup conditions and techniques 



39] . As for our estimated values, while 



the in vacuo results are off-range, the solvent-corrected ones are comparable to the exper- 
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FIG. 10: Spherical coordinates along the 'c2' scanning run. The total energy of the system is 
plotted as a function of both the radial distance (r) and the scanned distance coordinate. The 
angles 6 and (j) are plotted only as a function of the scanning distance coordinate. 

imental results. The equilibrium distance between the antibody and the hapten (the well 
minimum) also compares better to the experimental value in the case of the solvent-corrected 
simulations. Finally, remark that the different constraining schemes have little influence on 
the order of magnitude of the k oS values. 
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TABLE I: Kinetic and equilibrium parameters obtained from calculations based on the compu- 
tational scanning. Available experimental values are also presented for comparison: (a) and (c) 



determined in solution fref. 



3J and 



391 ] respectively), and (b) at a surface by SPR [9]. 



parameter 


simulations 


experimental 


T(K) 


in vacuo 


solvent corrected 




c2 


c3 


c2 


c3 






&off(s -1 ) 


3.4 x 1(T 6 


6.8 x 10" 6 


4.1 x 10~ 3 


5.4 x 10" 3 


1.9 x 10" 3 ( a) 
6.8 x 10~ 3 ( 6 ) 
4.3 x 10~ 3 - 2.5 x 10~ 2 W 


291 

298 
298 


equilibrium 
distance (A) 


3.50 
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V. CONCLUDING REMARKS 



Here we presented our first attempt to describe the unbinding of a complex biomolecular 
system in terms of a reduced set of relevant generalized coordinates while restricting most 
of its conformational internal degrees of freedom. The reported results open a practical 
and physically sound procedure to compute energy profiles along the selected reaction co- 
ordinate^). This was demonstrated in the present work for an experimentally well studied 
complex of the biologically relevant antigen- antibody system. For the example in question, it 
was actually possible to find a distance dependent escaping channel in the multidimensional 
potential energy landscape, thus reducing the unbinding to a low- dimensional problem: the 
system seems to be efficiently bound by this one distance coordinate. The effect of the 
solvent was also accounted for, and despite the fact that it was introduced as a correction a 
posteriori, it allowed us to ascertain that this is one effect that needs to be included, for it 
has a significant influence in the overall energetic profile and subsequent parameters derived 
from it. With solvent effects, the derived off-rates are in reasonable agreement with the 
experimentally determined ones, a result that can be regarded as an indicator that ours is 
indeed a realistic approach. 

The proposed approach would no doubt benefit from further refinements, namely in the 
way solvent effects are introduced (viz. include them during scanning, both implicit and 
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explicitly) and in the kinetic model, which we intend to carry on in the near future. We 
also have in mind to apply this same approach to the maturation series and engineered 
mutants of the 4-4-20-fiuorescein complex. That would allow us to further test our strategy, 
in particular its sensitiveness to the energetic differences arising from antibody's single-point 
mutations. In addition, computed association rates would be valuable parameters in different 
areas of immunological research, namely theoretical immunology 40j. Remark also that 
calculated k Q s values could be used to determine the related association rate k on using the 
relation k on = k g/K^Q\, for those systems where only the equilibrium dissociation constant 
Kd has been experimentally measured. And by identifying and rationalize the involved key 
structural features and interactions determining the unbinding, one could make insightful 
predictions and propose, for instance, affinity-enhancing mutations. Our long term goal is 
to extend our research to other molecular recognition processes besides the antigen-antibody 
one and to test the applicability and universality of our approach. 
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